
    These are not the k-mers you are looking for: efficient online k-mer counting using a probabilistic data structure

    K-mer abundance analysis is widely used for many purposes in nucleotide sequence analysis, including data preprocessing for de novo assembly, repeat detection, and sequencing coverage estimation. We present the khmer software package for fast and memory efficient online counting of k-mers in sequencing data sets. Unlike previous methods based on data structures such as hash tables, suffix arrays, and trie structures, khmer relies entirely on a simple probabilistic data structure, a Count-Min Sketch. The Count-Min Sketch permits online updating and retrieval of k-mer counts in memory which is necessary to support online k-mer analysis algorithms. On sparse data sets this data structure is considerably more memory efficient than any exact data structure. In exchange, the use of a Count-Min Sketch introduces a systematic overcount for k-mers; moreover, only the counts, and not the k-mers, are stored. Here we analyze the speed, the memory usage, and the miscount rate of khmer for generating k-mer frequency distributions and retrieving k-mer counts for individual k-mers. We also compare the performance of khmer to several other k-mer counting packages, including Tallymer, Jellyfish, BFCounter, DSK, KMC, Turtle and KAnalyze. Finally, we examine the effectiveness of profiling sequencing error, k-mer abundance trimming, and digital normalization of reads in the context of high khmer false positive rates. khmer is implemented in C++ wrapped in a Python interface, offers a tested and robust API, and is freely available under the BSD license at github.com/ged-lab/khmer
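As a rough illustration of the data structure the abstract describes, the sketch below is a minimal, generic Count-Min Sketch applied to k-mers. It is not khmer's implementation or API: the class name `CountMinSketch`, the helper `kmers`, the table dimensions, and the use of salted BLAKE2b hashes are all assumptions chosen for clarity. It shows the two properties the abstract highlights: only counters are stored (never the k-mers themselves), and a reported count can exceed the true count but cannot fall below it.

```python
import hashlib


class CountMinSketch:
    """Generic Count-Min Sketch (illustrative; not khmer's implementation)."""

    def __init__(self, width, depth):
        self.width = width  # counters per row (hypothetical parameter name)
        self.depth = depth  # number of independent hash rows
        self.tables = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One salted hash per row; the salt makes the rows behave independently.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                item.encode("ascii"),
                digest_size=8,
                salt=row.to_bytes(8, "little"),
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, item):
        # Online update: increment one counter in every row.
        for row, col in self._cells(item):
            self.tables[row][col] += 1

    def count(self, item):
        # Collisions can only inflate a counter, so the minimum across rows
        # is the tightest estimate and is never below the true count.
        return min(self.tables[row][col] for row, col in self._cells(item))


def kmers(sequence, k):
    """Yield every overlapping substring of length k."""
    for start in range(len(sequence) - k + 1):
        yield sequence[start:start + k]


sketch = CountMinSketch(width=1024, depth=4)
true_counts = {}
for kmer in kmers("ACGTACGTAC", 4):
    sketch.add(kmer)
    true_counts[kmer] = true_counts.get(kmer, 0) + 1

for kmer, true_count in true_counts.items():
    print(kmer, "true:", true_count, "estimated:", sketch.count(kmer))
```

Shrinking `width` saves memory at the cost of more collisions and therefore larger overcounts, which is the trade-off the abstract refers to as the miscount rate.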

    Flow Dynamics And Plasma Heating Of Spheromaks In SSX

    We report several new experimental results related to flow dynamics and heating from single dipole-trapped spheromaks and spheromak merging studies at SSX. Single spheromaks (stabilized with a pair of external coils, see Brown, Phys. Plasmas 13, 102503 (2006)) and merged FRC-like configurations (see Brown, Phys. Plasmas 13, 056503 (2006)) are trapped in our prolate (R = 0.2 m, L = 0.6 m) copper flux conserver. Local spheromak flow is studied with two Mach probes (r(1) = rho(i) ) calibrated by time-of-flight with a fast set of magnetic probes at the edge of the device. Both Mach probes feature six ion collectors housed in a boron nitride sheath. The larger Mach probe will ultimately be used in the MST reversed field pinch. Line averaged flow is measured by ion Doppler spectroscopy (IDS) at the midplane. The SSX IDS instrument measures with 1 μs or better time resolution the width and Doppler shift of the C-III impurity (H plasma) 229.7 nm line to determine the temperature and line-averaged flow velocity (see Cothran, RSI 77, 063504 (2006)). We find axial flows up to 100 km/s during formation of the dipole trapped spheromak. Flow returns at the wall to form a large vortex. Recent high-resolution IDS velocity measurements during spheromak merging show bi-directional outflow jets at +/- 40 km/s (nearly the Alfven speed). We also measure T_i >= 80 eV and T_e >= 20 eV during spheromak merging events after all plasma facing surfaces are cleaned with helium glow discharge conditioning. Transient electron heating is inferred from bursts on a four-channel soft x-ray array. The spheromaks are also characterized by a suite of magnetic probe arrays for magnetic structure B(r,t), and interferometry for n_e. Finally, we are designing a new oblate, trapezoidal flux conserver for FRC studies. Equilibrium and dynamical simulations suggest that a tilt-stable, oblate FRC can be formed by spheromak merging in the new flux conserver
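To make the ion Doppler spectroscopy step concrete, the sketch below applies the standard non-relativistic Doppler relations to the 229.7 nm line mentioned in the abstract. It is only an illustration, not the SSX analysis code: the function names are hypothetical, the temperature formula assumes purely thermal Gaussian broadening (no instrument or Stark contribution), and the emitter mass is approximated as 12 atomic mass units for carbon.

```python
import math

C_M_PER_S = 299_792_458.0          # speed of light
AMU_EV = 931.494e6                 # rest energy of one atomic mass unit, in eV
CARBON_REST_ENERGY_EV = 12.0 * AMU_EV  # assumption: carbon ion mass ~ 12 u


def velocity_from_shift(rest_nm, shift_nm):
    """Line-of-sight velocity (m/s) from a Doppler shift: v = c * dλ / λ."""
    return C_M_PER_S * shift_nm / rest_nm


def shift_from_velocity(rest_nm, velocity_m_per_s):
    """Doppler shift (nm) produced by a given line-of-sight velocity."""
    return rest_nm * velocity_m_per_s / C_M_PER_S


def temperature_from_fwhm(rest_nm, fwhm_nm, rest_energy_ev=CARBON_REST_ENERGY_EV):
    """Temperature (eV) from thermal width: kT = m c^2 (FWHM/λ)^2 / (8 ln 2)."""
    return rest_energy_ev * (fwhm_nm / rest_nm) ** 2 / (8.0 * math.log(2.0))


def fwhm_from_temperature(rest_nm, temperature_ev, rest_energy_ev=CARBON_REST_ENERGY_EV):
    """Thermal Doppler FWHM (nm) for a given temperature in eV."""
    return rest_nm * math.sqrt(8.0 * math.log(2.0) * temperature_ev / rest_energy_ev)


REST_NM = 229.7
jet_shift_nm = shift_from_velocity(REST_NM, 40e3)   # 40 km/s outflow jet
thermal_fwhm_nm = fwhm_from_temperature(REST_NM, 80.0)  # 80 eV ions
print(f"shift for 40 km/s: {jet_shift_nm:.4f} nm")
print(f"thermal FWHM at 80 eV: {thermal_fwhm_nm:.4f} nm")
```

Both quantities come out at a few hundredths of a nanometre, which gives a sense of the spectral resolution such a measurement needs.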

    Digital Weapons of Mass Destabilization

    In the coming decade, a global proliferation of networked technologies will widen the cyber threat landscape. Pairing new and unforeseen cyber vulnerabilities with weapons of mass destruction (WMD) increases the secondary threats that cyber attacks bring and also necessitates a shift in definitions. WMD will become weapons of mass destabilization, allowing adversaries to gain strategic advantage in novel ways. Altering this definition provides clarity and specific actions that can be taken to disrupt, mitigate and recover from this combined threat. Additionally, a new class of Digital WMD (DWMD) will emerge, threatening military, government, and civilian targets worldwide. These combined and new threats will require the expansion of current defensive or mitigation activities, partnerships, and preparation